Our dataset consists of historical stock prices over the last 12 years from Apple (AAPL). Our data comes from Kaggle (https://www.kaggle.com/szrlee/stock-time-series-20050101-to-20171231). The variables are date (Date), price at market open (Open), highest price of the day (High), lowest price of the day (Low), price at market close (Close), number of shares traded (Volume), and name of the stock (Name).
The goal of this project is to become familiar with the theory of time series and to make predictions with a time series. In this project, we explore trend and seasonality elimination methods, ARMA models, and GARCH models for predicting a financial time series.
Having the daily stock prices of Apple over 12 years, we can study our data at several levels of aggregation. In our study we will consider two levels:
1 - The average prices per month over the 12 years
2 - Daily prices over one month
Looking at the auto-correlation functions of the data, we see that the series cannot be reasonably modeled by a stationary series. The ACF values are all outside the confidence interval, which does not correspond to the ACF of a stationary series.
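The report's plots were produced in R; as an illustration only, here is a NumPy sketch of how a sample ACF and its approximate 95% confidence bounds (±1.96/√n) can be computed. The random-walk series below is our own toy example, not the Apple data: like the stock prices, its ACF decays very slowly and stays far outside the bounds.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho(h) for h = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    c0 = np.dot(x, x) / len(x)  # biased sample variance
    return np.array([np.dot(x[:len(x) - h], x[h:]) / (len(x) * c0)
                     for h in range(max_lag + 1)])

# Toy non-stationary series: a random walk.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.standard_normal(1000))
acf = sample_acf(walk, 20)
bound = 1.96 / np.sqrt(len(walk))  # approximate 95% bounds for white noise
```

For a stationary white-noise series, roughly 95% of the ACF values at nonzero lags would fall inside ±`bound`; here they all lie far outside it.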
We notice that the series does not behave in the same way depending on whether we look at the 12-year average or the daily data over one month. For level 1, the data seem to grow linearly. They can potentially be approximated by a linear regression of degree 1. Level 2 seems to have a more periodic behavior. We will now remove the trend and seasonality from the data.
We want to know if the residuals can be modeled by a stationary series.
In the previous section we analyzed the pattern of the data. We saw that the data at level 1 grow in a rather linear way, while the data at level 2 have a rather periodic behavior. We explore the possibility of representing the data as a realization of the Classical Decomposition Model.
$$X_t = m_t + s_t + Y_t \qquad (1)$$

where $m_t$ represents the trend, $s_t$ the seasonality with a known period $d$, and $Y_t$ a stationary random noise. We will test two methods of estimating and eliminating the trend and seasonality.
In this section, we will use a moving average to estimate the trend.
$$\hat{m}_t = \frac{1}{2q+1} \sum_{j=-q}^{q} X_{t+j}$$

where $q < t \le n - q$. Using $\hat{m}_t$ we get an estimate $\hat{Y}_t = X_t - \hat{m}_t$ of the noise $Y_t$.
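The report computes this in R; as an illustrative sketch under our own toy data (a linear trend plus noise, not the Apple series), the two-sided moving average and the residuals can be written as:

```python
import numpy as np

def ma_trend(x, q):
    """Two-sided moving average of order q: m_hat[t] = mean(x[t-q .. t+q]).
    The first and last q points cannot be estimated and are left as NaN."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    kernel = np.ones(2 * q + 1) / (2 * q + 1)
    m = np.full(n, np.nan)
    m[q:n - q] = np.convolve(x, kernel, mode="valid")
    return m

# Toy example: linear trend + noise; the residuals fluctuate around zero.
rng = np.random.default_rng(1)
t = np.arange(200)
x = 0.5 * t + rng.standard_normal(200)
m_hat = ma_trend(x, q=5)
resid = x - m_hat
```

A symmetric window reproduces a linear trend exactly, so the residuals here contain only (smoothed) noise.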
Since we use an additive (linear) model, we first take the log of the data.
After computing the residuals $\hat{Y}_t$, we obtain the following autocorrelation function for 40 lags. We see that most of the values after lag 2 remain inside the confidence interval.
Unit Root test to test stationarity of the residuals
We can apply a unit root test (the Dickey-Fuller test), whose null hypothesis is that the series has a unit root, i.e. is non-stationary. This allows us to know whether the obtained residuals are stationary or not.
The p-value of the test is 0.01, so we can reject the unit-root null hypothesis at the 5% significance level. Therefore, the residuals obtained by the moving average method can be modeled by a stationary series.
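The report uses an R unit-root test; as a hedged illustration of the idea, here is a minimal Dickey-Fuller regression in NumPy (no lag augmentation, intercept only): regress $\Delta y_t$ on $y_{t-1}$ and compare the t-statistic of the slope to the approximate 5% critical value of about -2.86. This is a simplification of the augmented test actually used.

```python
import numpy as np

def dickey_fuller_t(y):
    """t-statistic of the simple Dickey-Fuller regression
    dy_t = c + gamma * y_{t-1} + e_t (no lag augmentation)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    ylag = y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    beta, _, _, _ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return float(beta[1] / np.sqrt(cov[1, 1]))  # t-stat of gamma

# Toy stationary series (white noise): the t-statistic is far below
# the ~ -2.86 critical value, so the unit-root null is rejected.
rng = np.random.default_rng(2)
t_stat = dickey_fuller_t(rng.standard_normal(500))
```

Note that the Dickey-Fuller t-statistic does not follow a Student distribution under the null, which is why tabulated critical values such as -2.86 are used.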
In this method, we introduce the lag-d differencing operator $\nabla_d$, defined by $\nabla_d X_t = X_t - X_{t-d}$. Applying $\nabla_d$ to equation (1), and using the fact that the seasonality satisfies $s_t = s_{t-d}$, we have:

$$\nabla_d X_t = m_t - m_{t-d} + Y_t - Y_{t-d}$$

We thus have a decomposition of the differenced series in terms of trend and noise only, the seasonality having been eliminated. From there we can remove the remaining trend by applying the operator $\nabla = \nabla_1$, giving $\nabla \nabla_d X_t$. We use the diff function from R to apply this method.
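As an illustration of this double differencing (the report uses R's diff; this NumPy sketch and its toy series are ours), applying $\nabla \nabla_{12}$ to a noiseless linear trend plus a period-12 seasonal component annihilates both exactly:

```python
import numpy as np

def lag_diff(x, d=1):
    """Lag-d difference: (nabla_d x)[t] = x[t] - x[t-d]."""
    x = np.asarray(x, dtype=float)
    return x[d:] - x[:-d]

# Toy series: linear trend + period-12 seasonality, no noise.
t = np.arange(120)
x = 0.3 * t + 2.0 * np.sin(2 * np.pi * t / 12)
# nabla_12 cancels the seasonality and turns the trend into a constant;
# nabla then removes that constant, leaving (numerically) zero.
out = lag_diff(lag_diff(x, d=12), d=1)
```

With noise added, `out` would instead be the stationary combination $Y_t - Y_{t-1} - Y_{t-12} + Y_{t-13}$.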
We plot the auto-correlation function of the differenced series.
Several ACF values are still outside the confidence interval, so the residuals are not IID.
Unit Root test to test stationarity of the differenced series
We run a unit root test and obtain a p-value of 0.01, so we reject the unit-root null hypothesis at the 5% significance level. Therefore, the residuals obtained by differencing the series can be modeled by a stationary series.
In conclusion, we have made our time series stationary using both the differencing method and the moving average method.
In this section we want to determine if we can find an ARMA model that models our data reasonably well.
ARMA (Autoregressive Moving Average) models explain the relationship of the data series with random noise (the Moving Average part) and with its prior values (the Autoregressive part).
Mathematically, an ARMA(p, q) process $(X_t)$ satisfies:

$$X_t - \phi_1 X_{t-1} - \dots - \phi_p X_{t-p} = Z_t + \theta_1 Z_{t-1} + \dots + \theta_q Z_{t-q} \qquad (2)$$

where $(Z_t) \sim WN(0, \sigma^2)$.
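As an illustration of the MA part of the model (a NumPy sketch of our own, not the report's R code), an MA(1) process $X_t = Z_t + \theta Z_{t-1}$ has the theoretical ACF $\rho(1) = \theta/(1+\theta^2)$ and $\rho(h) = 0$ for $h \ge 2$, which we can check by simulation:

```python
import numpy as np

def acf(x, h):
    """Sample autocorrelation at lag h."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float((x[:-h] @ x[h:]) / (x @ x))

# Simulate an MA(1) with (hypothetical) theta = 0.6.
rng = np.random.default_rng(3)
theta = 0.6
z = rng.standard_normal(100_000)
x = z[1:] + theta * z[:-1]
rho1, rho2 = acf(x, 1), acf(x, 2)
# rho1 is near 0.6 / 1.36 ~ 0.441 and rho2 is near 0.
```

The sharp cut-off of the ACF after lag q is the classic signature used to identify a pure MA(q) model.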
Using R, we manage to find the ARMA model that best fits our series.
It turns out to be an ARIMA(0,1,1) model, i.e. a model such that our series, differenced once, is an ARMA(0,1), that is, an MA(1).
This is consistent with our results above: we saw that differencing the series once made it stationary, and this differenced series is well modeled by an MA(1) process.
Using R for the daily price data over a month, we see that these data are better modeled by an ARMA(1,0) model, i.e. an AR(1).
This means that the daily prices seem to be better explained by past realizations while the averages are better explained by noise.
In this section we will predict the monthly average values of Apple’s stock with an ARIMA(0,1,1) model. We will estimate the error of the prediction through the root mean square error (RMSE).
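The error metric itself is simple to state; as a self-contained sketch (ours, not the report's R code):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy check: errors of 0 and 2 give sqrt((0 + 4) / 2) = sqrt(2).
err = rmse([1.0, 2.0], [1.0, 4.0])
```

The RMSE is in the same units as the predicted quantity, which makes it directly comparable to the price level of the series.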
We first divide our dataset into a training sample and a validation sample (test sample) with the ratio 70/30.
We consider the monthly averages.
The RMSE obtained for this prediction is:
This result is not very satisfactory. In the following we will try to see how we can improve it.
In this section, we will explore GARCH models to see if this class of models can make a better prediction of our data. The ARCH and GARCH (Generalized Autoregressive Conditional Heteroscedasticity) models were developed to reflect the properties of financial time series. These properties include skewness, volatility clustering, and serial dependence without linear correlation. They cannot be captured by traditional linear models such as ARMA.
The ARCH and GARCH models are written as follows: $(X_t)$ is a stationary process such that

$$X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \qquad (Z_t) \sim \mathrm{IID}\ \mathcal{N}(0,1) \qquad (3)$$

The ARCH(p) model corresponds to the special case q = 0.
These models are applied to the log returns of the closing stock prices, i.e. $r_t = \log(P_t / P_{t-1}) = \log P_t - \log P_{t-1}$, where $P_t$ is the closing price on day $t$, because we notice that log returns tend to be stationary.
We start by transforming our data into log returns.
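This transformation is a one-liner; as a sketch (our own toy prices, not the Apple data):

```python
import numpy as np

def log_returns(prices):
    """Log returns r_t = log(P_t / P_{t-1}) = log P_t - log P_{t-1}."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

# Toy check: prices 1, e, e^2 give two log returns both equal to 1.
r = log_returns([1.0, np.e, np.e ** 2])
```

A series of n prices yields n - 1 log returns; unlike simple returns, log returns add up across periods, which is convenient for multi-step predictions.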
Log returns visualization
We notice in the graph above that the series does indeed look stationary.
In the ACF above, we see that the values are very close to 0, while in the ACF of the squares, the values are significantly different from 0. This implies that the realizations of the log-return series are uncorrelated but not independent. The ARCH and GARCH models capture this dependence through the volatility term $\sigma_t$.
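This "uncorrelated but dependent" behavior can be reproduced by simulation (a NumPy sketch with hypothetical parameters, not a fit to the Apple data): an ARCH(1) process has a near-zero ACF while its squares are clearly autocorrelated.

```python
import numpy as np

def acf1(x):
    """Lag-1 sample autocorrelation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float((x[:-1] @ x[1:]) / (x @ x))

# Simulate an ARCH(1): x_t = sigma_t * z_t, sigma_t^2 = a0 + a1 * x_{t-1}^2.
rng = np.random.default_rng(4)
n, a0, a1 = 20_000, 1.0, 0.3
x = np.zeros(n)
for t in range(1, n):
    sig2 = a0 + a1 * x[t - 1] ** 2
    x[t] = np.sqrt(sig2) * rng.standard_normal()
# acf1(x) is near 0, while acf1(x**2) is clearly positive (near a1).
```

The dependence lives entirely in the conditional variance: large shocks tend to be followed by large shocks of either sign.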
Log returns distribution
We see that the tails of the returns distribution are heavier than those of the normal distribution (in green). This means that the returns are not normally distributed: very low returns and very high returns can both be observed, depending on the day.
Building a GARCH model
We construct a standard GARCH model that assumes a constant mean of 0 and normally distributed innovations (equation (3)). We obtain a GARCH(1,1) model such that:
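The fitted coefficients are not reproduced here; as a sketch with hypothetical parameter values, the GARCH(1,1) one-step-ahead conditional variance recursion that drives the prediction looks like:

```python
def garch11_next_var(omega, alpha, beta, x_t, sig2_t):
    """One-step-ahead conditional variance of a GARCH(1,1):
    sigma^2_{t+1} = omega + alpha * x_t^2 + beta * sigma^2_t."""
    return omega + alpha * x_t ** 2 + beta * sig2_t

# Toy check with hypothetical parameters (not the fitted values):
# 0.1 + 0.1 * 1 + 0.8 * 1 = 1.0
v = garch11_next_var(omega=0.1, alpha=0.1, beta=0.8, x_t=1.0, sig2_t=1.0)
```

When alpha + beta < 1 the process is covariance-stationary and the variance forecast reverts to the unconditional level omega / (1 - alpha - beta).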
The Ljung-Box test for correlation tells us, at the 5% significance level, that there is no correlation in the data (as noted above with the ACF). This test supports the validity of our GARCH model. On the other hand, Pearson's goodness-of-fit test rejects the null hypothesis that the residuals are normal, which means there is room to improve our model on this point.
We divide our data set into two parts (ratio 70/30): a training sample and a validation sample.
We apply the GARCH(1,1) model on the training data and from there we make a prediction for the values of the validation sample. As before, we measure our error with the RMSE.
The graph above shows the predicted and actual values of the log returns between 2014 and 2017. The prediction is rather close to the true values. We obtain an error of:
Visualization
To better understand the accuracy of our model, we can re-express the log returns in terms of prices and compare the predicted trajectories and the true trajectory.
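Inverting the log-return transformation is straightforward; as a self-contained sketch (toy prices, not the Apple data):

```python
import numpy as np

def prices_from_log_returns(p0, r):
    """Rebuild a price path from log returns: P_t = P_0 * exp(r_1 + ... + r_t)."""
    return p0 * np.exp(np.cumsum(np.asarray(r, dtype=float)))

# Round trip: the reconstructed path matches the original prices.
p = np.array([100.0, 101.5, 99.8, 102.3])
r = np.diff(np.log(p))
p_rec = prices_from_log_returns(p[0], r)
```

Because prices are the exponential of cumulated returns, small errors in the predicted returns compound along the trajectory, which is why the price-level comparison is a stricter test of the model than the RMSE on returns.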
It is possible to improve our model, either by giving it more training data or by using a GARCH variant that does not assume normally distributed residuals, since the Pearson test showed that normality does not hold.
In this section, we will explore the EGARCH model, which is a modification of the GARCH model. As mentioned above, GARCH is a model that was developed to reflect the properties of financial time series. EGARCH is a less restrictive model than GARCH: it does not assume that log returns are Gaussian, and it does not force the coefficients of the conditional variance to be positive (its volatility response is asymmetric). This has the effect of incorporating the following stylized facts:
The distribution of financial data has thick tails
Negative shocks at time $t-1$ have a stronger impact on the volatility at time $t$ than positive shocks of the same magnitude
The model is written as follows: $X_t = \sigma_t Z_t$, with

$$\log \sigma_t^2 = \alpha_0 + \sum_{k=1}^{\infty} \beta_k\, g(Z_{t-k}), \qquad g(z) = \theta z + \gamma \left( |z| - \mathbb{E}|Z_t| \right)$$

where $\theta$ and $\gamma$ are real and $(Z_t)$ is IID with a distribution symmetric about 0.
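The asymmetry comes entirely from the news-impact function g; a small sketch with hypothetical values of theta and gamma (not fitted to our data) makes this concrete:

```python
import numpy as np

SQRT_2_OVER_PI = np.sqrt(2.0 / np.pi)  # E|Z| for a standard normal Z

def g(z, theta=-0.1, gamma=0.2):
    """EGARCH news-impact term g(z) = theta*z + gamma*(|z| - E|Z|).
    theta and gamma here are hypothetical illustrative values."""
    return theta * z + gamma * (np.abs(z) - SQRT_2_OVER_PI)

# With theta < 0, a negative shock raises log-volatility more than a
# positive shock of the same size: g(-1) > g(+1).
impact_neg, impact_pos = g(-1.0), g(1.0)
```

Since g acts on log-volatility, no positivity constraint on the coefficients is needed: sigma_t^2 = exp(log sigma_t^2) is positive by construction.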
This model improves the RMSE. To go further, we could also change the mean specification of the GARCH model for a better fit. It is important to remember that, to really benefit from GARCH models other than the standard GARCH, one must check whether the data at hand verify the stylized facts of financial data on which these models are based.